Gradient Descent Converges to Minimizers
Authors
Abstract
We show that gradient descent converges to a local minimizer, almost surely with random initialization. This is proved by applying the Stable Manifold Theorem from dynamical systems theory.
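To make the statement concrete, here is a minimal numerical sketch (our illustration, not code from the paper). The toy objective f(x, y) = x^2 + (y^2 - 1)^2 / 4 has a strict saddle at (0, 0) and two local minimizers at (0, 1) and (0, -1); the saddle's stable manifold is the measure-zero line y = 0, so a random initialization escapes it almost surely.

import numpy as np

# Toy objective (our example, not from the paper):
# f(x, y) = x^2 + (y^2 - 1)^2 / 4.
# Critical points: a strict saddle at (0, 0) with Hessian diag(2, -1),
# and two local minimizers at (0, 1) and (0, -1).
def grad_f(p):
    x, y = p
    return np.array([2.0 * x, (y ** 2 - 1.0) * y])

rng = np.random.default_rng(0)
step = 0.1  # small constant step size

# Only initializations on the line y = 0 (the saddle's stable manifold,
# a Lebesgue measure-zero set) converge to the saddle; a random draw
# misses that line almost surely.
p = rng.uniform(-2.0, 2.0, size=2)
for _ in range(1000):
    p = p - step * grad_f(p)  # gradient descent map g(p) = p - step * grad_f(p)

print(p)  # approximately (0, 1) or (0, -1), one of the two minimizers

Starting from y = 0 instead, the iterates converge to the saddle (0, 0), but under any initialization density that event has probability zero.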
Related sources
Gradient Descent Only Converges to Minimizers
We show that gradient descent converges to a local minimizer, almost surely with random initialization. This is proved by applying the Stable Manifold Theorem from dynamical systems theory.
Gradient Descent Only Converges to Minimizers: Non-Isolated Critical Points and Invariant Regions
We prove that the set of initial conditions from which gradient descent converges to strict saddle points has (Lebesgue) measure zero, even for non-isolated critical points, answering an open question in [1].
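A concrete instance of a non-isolated strict-saddle set (our illustrative example, not one taken from the paper):

f(x, y, z) = x^2 - y^2, \qquad \nabla f(x, y, z) = (2x, \, -2y, \, 0),
\nabla^2 f = \operatorname{diag}(2, -2, 0), \qquad \{\nabla f = 0\} = \{(0, 0, z) : z \in \mathbb{R}\}.

Every point of the z-axis is critical, and the Hessian's eigenvalue -2 makes each one a strict saddle, yet none is isolated; the measure-zero conclusion above still covers the entire line.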
Stabilizing Adversarial Nets with Prediction Methods
Adversarial neural networks solve many important problems in data science, but are notoriously difficult to train. These difficulties come from the fact that optimal weights for adversarial nets correspond to saddle points, and not minimizers, of the loss function. The alternating stochastic gradient methods typically used for such problems do not reliably converge to saddle points, and when co...
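A minimal sketch of the prediction idea on a toy saddle-point problem (our reconstruction from the abstract; the bilinear loss L(u, v) = u * v and the step sizes are our assumptions, not the paper's setup). On this loss, plain alternating gradient descent/ascent orbits the saddle at (0, 0) without converging; extrapolating the descent variable one step ahead before the ascent update pulls the iterates in.

# Toy saddle-point problem (our illustration): L(u, v) = u * v,
# with its saddle point at (0, 0). grad_u L = v and grad_v L = u.
alpha, beta = 0.2, 0.2  # step sizes (assumed values)
u, v = 1.0, 1.0
for _ in range(500):
    u_new = u - alpha * v        # descent step on u
    u_bar = u_new + (u_new - u)  # prediction: extrapolate u one step ahead
    v = v + beta * u_bar         # ascent step on v sees the predicted u
    u = u_new

print(u, v)  # both end up near 0, the saddle point of L

The extrapolation u_bar = 2 * u_new - u is the lookahead step: the ascent player responds to where the descent player is heading rather than where it just was.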
Journal: CoRR
Volume: abs/1602.04915
Issue: -
Pages: -
Published: 2016